The Message Bus is the backbone of the BizTalk Server product. The bus is made up of distinct parts, each of which is explained later in the subsection "Messaging Components." The most obvious of these is the Messagebox, which is explained first. The others include the messages within the Messagebox and the messaging components that move messages to their proper endpoints.
1. The Messagebox
The Messagebox is simply a database. This database has many tables, several of which are responsible for storing the messages that BizTalk receives. Each message has associated metadata called the message context, and the individual metadata items are stored as key/value pairs called context properties. Context properties describe all the data needed to identify elements such as the following:
The inbound port on which the message was received.
The inbound transport type.
Transport-specific information such as ReceivedFileName in the case of the file adapter, InboundQueueName in the case of MSMQ or MQSeries, and so on.
The autogenerated internal MessageID of the message, so that it can be uniquely identified.
The schema type and namespace of the message, assuming it is an XML message using namespace#root as the message type. A common misconception is that the message type is always composed of the namespace and the root node name. This is not necessarily so. Richard Seroter, Microsoft MVP, explains this in greater detail on his blog (http://seroter.wordpress.com/2009/02/27/not-using-httpnamespaceroot-asbiztalk-message-type/).
Many people equate the Messagebox with the whole of the BizTalk Server messaging infrastructure. This is false, and is similar to saying that a database is just a set of data files sitting on a hard drive. The messaging infrastructure, or Message Bus, consists of a dozen or so interrelated components, each of which performs a specific job.
2. Messaging Components
When new architects start designing BizTalk solutions, few stop to think about how messages are actually going to be received and sent to their proper endpoints. This job belongs to the messaging components within BizTalk, each of which is explained next.
2.1. Host Services
A BizTalk host is nothing more than a logical container. Hosts let you arrange the messaging components of your application into groups that can be distributed across multiple processes and across machines. A host is most often used to separate adapters, orchestrations, and ports so that they run on separate machines to aid in load balancing. A host instance is just that, an instance of the host. The instance is actually just a service called BTSNTSvc.exe that runs on the machine. This process provides the BizTalk engine with a place to execute and allows instances of different hosts to run on one machine at the same time. Each host instance appears as a separate BTSNTSvc.exe process in the Windows Task Manager. If you examine the Windows Services control panel applet, you will find that each host configured on the machine shows up as a separate service named after the host. The host instance exists simply to give the BizTalk subservices a place to run. Most people think of the BizTalk service as a single unit, but it is really a container for multiple services, each of which is described in the following text.
The difference between an Isolated host and an In-Process host is that an Isolated host must run under another process, in most cases IIS, while an In-Process host is a complete BizTalk service on its own. Additionally, since Isolated hosts exist outside of the BizTalk environment, the BizTalk Administration Tools are not able to determine the status of these hosts (stopped, started, or starting). Security is also fundamentally different in an Isolated host versus an In-Process host. In-Process hosts must run under an account that is within the In-Process host's Windows group, and they do not maintain security context within the Messagebox. For Isolated hosts, you normally create a separate account with minimum permissions, since Isolated hosts in most cases receive messages from untrusted sources such as the Internet. Isolated hosts are useful when an external process already exists that will receive messages, either by some proprietary means or by a transport protocol such as HTTP. IIS is a good example of such a process. In such cases, the Isolated host runs only one instance of the End Point Manager and is responsible for receiving messages from its transport protocol and sending them to the Messagebox through the EPM. Outside of hosting an IIS process, Isolated hosts could be used to attach to a custom Windows service that polls a message store for new items and publishes them to the Messagebox. Isolated processes provide an architectural advantage for these scenarios: they do not require any interprocess communication (IPC) between the EPM and the Windows service that hosts it. The only real IPC is between the Isolated host and the Messagebox database, which is a database service hosted most likely on another machine.
In-Process hosts can host
all BizTalk subservices depending on how they are configured. They not
only can receive messages from the outside world, but they can send them
through Send Adapters, poll for messages that match a subscription, and
host XLANG engine instances. In the case of a Send Adapter, an
In-Process host must be used because of how the security context of the
Adapter Framework is built. To use adapters with Isolated hosts, the
adapters have to use custom IPC. HTTP and SOAP adapters use this
technique to interact with aspnet_wp.exe/w3wp.exe processes. Each
Isolated host has the set of subservices shown in Table 1 running within it. These services can also be viewed in the adm_HostInstance_SubServices table in the Management Database.
Table 1. Host Instance Subservices
Service | Description |
---|---|
Caching | Service used to cache information that is loaded into the host. Examples of cached information are assemblies that are loaded, adapter configuration information, custom configuration information, and so on. |
End Point Manager | Go-between for the Message Agent and the Adapter Framework. The EPM hosts send/receive ports and is responsible for executing pipelines and BizTalk transformations. |
Tracking | Service that moves information from the Messagebox to the Tracking Database. |
XLANG/s | Host engine for BizTalk Server orchestrations. |
MSMQT | MSMQT adapter service; serves as a replacement for the MSMQ protocol when interacting with BizTalk Server. The MSMQT protocol has been deprecated in BizTalk Server 2006 and should be used only to resolve backward-compatibility issues. |
2.2. Subscriptions
To fully understand the Message Bus architecture, it is critical to understand how subscriptions work and what enlisting is. Subscriptions are the mechanism by which ports and orchestrations receive and send messages within a BizTalk Server solution. Each BizTalk process that runs on a machine has a Message Agent, which is responsible for searching for messages that match subscriptions and routing them to the End Point Manager (EPM), which actually handles the message and sends it where it needs to go. The EPM is the broker between the Messagebox and the pipeline/port/adapter combination. Orchestration subscriptions are handled by a different service called XLANG/s. These services are executed within the BTSNTSvc.exe process that runs on the host.
2.2.1. Subscribing
According to Microsoft, "A subscription is a collection of comparison statements, known as predicates, involving message context properties and the values specific to the subscription." Predicates are inserted into one of the Messagebox's predicate tables, based on the type of operation specified in the subscription being created. The predicate tables correspond to the predicates that are available in the filter editor for defining filter criteria on ports. The two sets match because a filter expression is actually used to build each subscription. When you define a filter expression, you are modifying the underlying subscription within BizTalk so that it contains the filter information from your expression.
The BizTalk services create a subscription in the Messagebox by calling two stored procedures. These are bts_CreateSubscription_{HostName} and bts_InsertPredicate_{HostName}.
The subscription is created based on which host will be handling the
subscription, which is why these stored procedures are created
automatically when the host is created in the Microsoft Management
Console.
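The idea of a subscription as a set of predicates over context properties can be sketched in a few lines. This is a conceptual model only, not the Messagebox schema or the stored procedures named above: the Predicate and Subscription classes, the operator subset, the Quantity property, and the subscriber name are all assumptions made for illustration, and the sketch combines predicates with AND only.

```python
import operator
from dataclasses import dataclass

# Illustrative subset of comparison operators a predicate might use.
OPERATORS = {
    "==": operator.eq,
    "!=": operator.ne,
    "<": operator.lt,
    ">": operator.gt,
}


@dataclass(frozen=True)
class Predicate:
    prop: str      # name of the context property to test
    op: str        # key into OPERATORS
    value: object  # value specific to the subscription

    def matches(self, context: dict) -> bool:
        # A property that is absent from the context never matches.
        if self.prop not in context:
            return False
        return OPERATORS[self.op](context[self.prop], self.value)


@dataclass(frozen=True)
class Subscription:
    subscriber: str    # hypothetical port or orchestration name
    predicates: tuple  # every predicate must hold (AND-only in this sketch)

    def matches(self, context: dict) -> bool:
        return all(p.matches(context) for p in self.predicates)


sub = Subscription(
    subscriber="SendOrdersPort",
    predicates=(
        Predicate("MessageType", "==", "http://schemas.abccompany.com#Request"),
        Predicate("Quantity", ">", 5),
    ),
)
context = {"MessageType": "http://schemas.abccompany.com#Request", "Quantity": 10}
print(sub.matches(context))  # True
```

A filter expression on a port plays the role of the predicates tuple here: changing the filter changes what the subscription matches.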
2.2.2. Enlisting
A common question is what the difference is between enlisting a port and starting a port. The difference is simple. Enlisted ports have subscriptions written for them in the Messagebox, while unenlisted ports do not. The same is true for orchestrations. Artifacts that are not enlisted are simply in "deployment limbo": they are ready to process messages, but no way exists for the Messaging Engine to send them one. The main effect is that ports and orchestrations that are enlisted, but not started, will have any messages with matching subscription information queued within the Messagebox, ready to be processed once the artifact is started. If the port or orchestration is not enlisted, message routing will fail since no subscription is available, and the message will produce a "No matching subscriptions were found for the incoming message" exception in the Event Log. Be aware of a common and potentially risky situation that arises when you have more than one subscriber for a particular message type. In such cases, if the published message is routed to at least one of the subscribers, any unenlisted subscribers never get the message, and no error is raised, since the message satisfied another subscriber.
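The three states described here (unenlisted, enlisted but not started, and started) can be modeled with a small sketch. Everything in it is hypothetical: the Artifact class, the publish function, and the NoMatchingSubscription exception are stand-ins for the behavior described in the text, not BizTalk types.

```python
from dataclasses import dataclass, field


class NoMatchingSubscription(Exception):
    """Stands in for the routing failure raised when nothing subscribes."""


@dataclass
class Artifact:
    name: str
    message_type: str
    enlisted: bool = False
    started: bool = False
    queued: list = field(default_factory=list)     # waiting in the Messagebox
    processed: list = field(default_factory=list)

    def enlist(self):
        self.enlisted = True

    def start(self):
        self.enlisted = True  # a started artifact always has a subscription
        self.started = True
        self.processed.extend(self.queued)  # drain anything queued while stopped
        self.queued.clear()


def publish(message_type, body, artifacts):
    # Only enlisted artifacts have a subscription that can match.
    subscribers = [
        a for a in artifacts if a.enlisted and a.message_type == message_type
    ]
    if not subscribers:
        raise NoMatchingSubscription(message_type)
    for artifact in subscribers:
        target = artifact.processed if artifact.started else artifact.queued
        target.append(body)


started = Artifact("PortA", "Order")
started.start()
waiting = Artifact("PortB", "Order")
waiting.enlist()
limbo = Artifact("PortC", "Order")  # never enlisted

# PortA processes the message, PortB queues it, and PortC silently misses it.
publish("Order", "order-1", [started, waiting, limbo])
```

Note that the call succeeds even though PortC never sees the message, which is the risky multi-subscriber case. Publishing to PortC alone would raise NoMatchingSubscription.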
When a port is enlisted, the Message Agent will create subscriptions for any message whose context property for TransportID matches the port's transport ID. For orchestrations, it also creates the subscription based on the MessageType
of the message that is being sent to the port within the orchestration.
Binding an orchestration port to a physical send port will force the
EPM to write information about that binding to the Management Database.
Should the orchestration send messages through its logical port to the
physical port, it will include the transport ID in the context so that
the message is routed to that specific send port.
The next point is related to the pub/sub nature of the Message Bus. Since any endpoint with a matching subscription can process the message once it is sent from an orchestration to the send port, it is possible for multiple endpoints to act upon that message. This is critical to understand. Sending a message through an orchestration port to a bound physical port simply guarantees that a subscription will be created so that the message is routed to that particular endpoint. Nothing says that no other subscriber may also act on that message. Developers often overlook this point and assume that since the port is bound, the message simply ends up at the correct send port by magic. In reality, all that is happening is that the Message Agent is writing a subscription that hard-codes the context properties of that message so that it will always end up at least at that particular send port. Sending the message through the send port simply publishes the message in the Messagebox, and the engine and subscriptions take care of the rest, so you do not have to publish a message over and over again in order to reach multiple targets.
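A minimal sketch makes the "at least" concrete. It assumes a simplified router in which every subscription is a set of equality tests on context properties; the route function, the port names, and the port-42 transport ID are invented for the example.

```python
def route(context, subscriptions):
    """Return every subscriber whose predicates all match the context."""
    return [
        name
        for name, predicates in subscriptions.items()
        if all(context.get(key) == value for key, value in predicates.items())
    ]


subscriptions = {
    # Written when the orchestration port is bound: matches the transport ID.
    "BoundSendPort": {"TransportID": "port-42"},
    # An unrelated send port whose filter matches on message type.
    "AuditSendPort": {"MessageType": "http://schemas.abccompany.com#Request"},
}

# Context of a message sent by the orchestration through its bound port.
context = {
    "TransportID": "port-42",
    "MessageType": "http://schemas.abccompany.com#Request",
}

recipients = route(context, subscriptions)
print(recipients)  # ['BoundSendPort', 'AuditSendPort']
```

The binding guarantees that BoundSendPort is in the result; it does nothing to keep AuditSendPort out.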
2.3. Messages
A message within BizTalk is more than just a direct representation of the document received from the outside world. BizTalk has a model where messages contain both data and context. Understanding how messages are stored within the Messagebox is crucial to architecting systems that take advantage of the product's internal message representation.
2.3.1. What Is a Message?
A message
is a finite entity within the BizTalk Messagebox. Messages have context
properties and zero-to-many message parts. Subscriptions match
particular context properties for a message and determine which
endpoints are interested in processing it. As mentioned before, there is
one critical rule that will never change:
Messages are immutable once they are published.
Many people who have worked with BizTalk for years do not fully understand this rule. A message cannot be changed once it has reached the Messagebox. At this point most developers would say rather proudly, "But what about a pipeline component? I can write a pipeline component that modifies the message and its payload along with the context properties, right?" The answer is already in the question. A message can be modified only in a pipeline, either send or receive. A receive pipeline modifies the message before it gets to the Messagebox; at the end of the pipeline, the message is published. A send pipeline operates on the message after it leaves the Messagebox and before it is sent out. The original message remains unmodified in the Messagebox database regardless of what the send pipeline does with it.
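The rule can be illustrated with a small model in which a published message is a frozen object and a send pipeline returns a changed copy. This is only a sketch of the idea: PublishedMessage, publish, and send_pipeline are hypothetical names, and the uppercase transform and the Stage property stand in for whatever a real pipeline component would do.

```python
from dataclasses import dataclass, replace
from types import MappingProxyType


@dataclass(frozen=True)
class PublishedMessage:
    body: bytes
    context: MappingProxyType  # read-only view of the context properties


def publish(body, context):
    # Copy the context so later changes by the caller cannot leak in.
    return PublishedMessage(body, MappingProxyType(dict(context)))


def send_pipeline(message):
    # The pipeline works on its own copy; the stored message is untouched.
    new_context = dict(message.context)
    new_context["Stage"] = "sent"
    return replace(
        message,
        body=message.body.upper(),
        context=MappingProxyType(new_context),
    )


stored = publish(b"<order/>", {"MessageType": "Order"})
outbound = send_pipeline(stored)

print(stored.body)    # b'<order/>'
print(outbound.body)  # b'<ORDER/>'
```

Attempting to assign to stored.body or to add a key to stored.context raises an error, which mirrors the fact that nothing can alter the message once it is in the Messagebox.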
2.3.2. Messages vs. Message Parts
Messages are composed of zero or more message parts. All messages with parts generally have one part that is marked as the body part. The body part is considered to contain the data or "meat" of the message. Multipart messages are messages that contain more than one document: one "body" and any number of additional parts. The closest analogy is an email message with attachments. Many adapters examine only the body part of a multipart message and ignore the other parts. If you look at the Messagebox database, there are two specific tables, one that holds all messages that flow through BizTalk and one that holds all the message parts. This zero-to-many relationship implies that message parts can be reused in multiple messages, and that is indeed true. Each message part has a unique part ID that is stored in the MessageParts table and is associated with the message ID of the main message. It is also important to understand that message parts contain message bodies, which are generally XML based. If a message is received on a port that uses a pass-through pipeline, however, the message can be anything, including binary data. When using a pass-through pipeline, a Receive Adapter stamps its values into the message context, but no properties can be promoted from the data of the message. This follows naturally: when you are accepting binary data, BizTalk has no mechanism to examine the message body part and determine the message type, so it cannot promote it. In this case, the message will contain one message part whose message body is a stream of binary data.
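The relationship between a message and its parts can be sketched as a simple data structure. The class and field names below are assumptions made for the example and do not mirror the Messagebox tables or the BizTalk object model; adapter_payload imitates an adapter that looks only at the body part.

```python
from dataclasses import dataclass, field
from typing import Optional
from uuid import uuid4


@dataclass
class MessagePart:
    name: str
    data: bytes  # XML in the common case, arbitrary bytes with pass-through
    part_id: str = field(default_factory=lambda: str(uuid4()))


@dataclass
class Message:
    message_id: str = field(default_factory=lambda: str(uuid4()))
    parts: list = field(default_factory=list)  # zero or more parts
    body_part_name: Optional[str] = None

    def add_part(self, part, is_body=False):
        self.parts.append(part)
        if is_body:
            self.body_part_name = part.name

    @property
    def body_part(self):
        for part in self.parts:
            if part.name == self.body_part_name:
                return part
        return None


def adapter_payload(message):
    """Return only the body part's data, ignoring any other parts."""
    body = message.body_part
    return body.data if body else b""


# Like an email with an attachment: one body plus one additional part.
order = Message()
order.add_part(MessagePart("Order", b"<Order/>"), is_body=True)
order.add_part(MessagePart("Scan", b"\x00\x01\x02"))

print(adapter_payload(order))  # b'<Order/>'
```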
2.4. Message Context Properties
Message context properties are defined in what is called a property schema. The properties themselves are then stored into context property bags. The context property bags are simply containers for the properties, which are stored as key/value pairs.
2.4.1. Context Property Schemas
The schema of the inbound
message is used by BizTalk Server to associate it with any corresponding
property schemas. There is a global properties schema every message can
use by default that contains system-level properties. It is possible to
create custom properties schemas that can define application-specific
and typically content-based properties that may be required such as an
internal organizational key, the customer who submitted the document,
and so on.
System-level properties
defined within global property schemas are essentially the same as
custom context properties defined within a custom property schema. Both
types have a root namespace that is used to identify the type of
property, and both are stored within the context property bag for a
given message. In reality there is no real difference to the runtime in
terms of whether a context property is a "system-level" property or a
"custom" property.
Context properties, whether they are system or custom properties, define part of the subscription that is used to evaluate which endpoint(s) have a valid subscription to the message. The most common message subscription is based on the message type. BizTalk typically identifies the message type in the message context as the XML namespace of the message and the root node name, joined by the # character. For example, say that you had the document shown in Example 1.
Example 1. XML Order Request Sample Document
<ns0:Request xmlns:ns0="http://schemas.abccompany.com">
  <Header>
    <ReqID>4</ReqID>
    <Date>6/6/2005</Date>
  </Header>
  <Item>
    <Description>Description_0</Description>
    <Quantity>10</Quantity>
    <UnitPrice>2</UnitPrice>
    <TotalPrice>2</TotalPrice>
  </Item>
</ns0:Request>
The BizTalk message type in this example would be http://schemas.abccompany.com#Request.
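The namespace#root convention is easy to reproduce outside BizTalk. The following sketch derives the same string from a document using Python's standard XML parser; the message_type function is a hypothetical helper, it is not how the BizTalk disassembler is implemented, and returning the bare root name for a document without a namespace is an assumption of the sketch.

```python
import xml.etree.ElementTree as ET


def message_type(xml_text: str) -> str:
    """Build a namespace#root string from the document's root element."""
    root = ET.fromstring(xml_text)
    # ElementTree renders a namespace-qualified tag as "{namespace}localname".
    if root.tag.startswith("{"):
        namespace, local_name = root.tag[1:].split("}", 1)
        return f"{namespace}#{local_name}"
    # Assumption for this sketch: no namespace means the root name alone.
    return root.tag


doc = (
    '<ns0:Request xmlns:ns0="http://schemas.abccompany.com">'
    "<Header><ReqID>4</ReqID></Header>"
    "</ns0:Request>"
)
print(message_type(doc))  # http://schemas.abccompany.com#Request
```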
For message type-based subscriptions, the subscription is then evaluated by the Message Agent to determine whether any endpoints have subscriptions for the message in question. All subscriptions within the solution can be viewed in the BizTalk MMC snap-in. Figure 1 shows that each of the message properties can be viewed within the BizTalk Administration Console and selected in the message properties drop-down list, which can be used to search for messages within the tool.
NOTE
The message context
properties will only be available if the XML or flat-file pipelines
were used. If the pass-through pipeline processed the message, no
properties would be available for searching in the BizTalk
Administration Console.
Using subscriptions to route documents to the proper endpoints is called content-based routing (CBR). Having a thorough understanding of the pub/sub nature of the BizTalk Message Bus is crucial when designing any large messaging-based application, especially in situations where there will be a significant amount of routing between organizations and trading partners.
2.4.2. The Context Property Bag
As stated previously, context properties are simply key/value pairs stored in an object that implements the IBasePropertyBag interface. As you can see in the following code and in Table 2, the definition of the interface is quite simple:
<Guid("fff93009-75a2-450a-8a39-53120ca8d8fa")>
<InterfaceType(ComInterfaceType.InterfaceIsIUnknown)>
Public Interface IBasePropertyBag
    ' Members are summarized in Table 2.
End Interface
Table 2. IBasePropertyBag Interface Definition
Public Properties | -- |
CountProperties | Gets the number of properties in the property bag |
Public Methods | -- |
Read | Reads the value and type of the given property in the property bag |
ReadAt | Reads the property at the specified index value in the property bag |
Write | Adds or overwrites a property in the property bag |
Given that the context
property bag is such a simple structure, it is possible to use the
BizTalk API to write any property you want into the property bag. Note
that this does not require the property be promoted. Writing a property
into the property bag does not mean it is promoted and available for
message routing. If a value needs to be visible to the Message Bus for
routing purposes, it has to be promoted. By using the property schema to
promote a property, either by using a custom property schema or by
promoting a value into a property defined in the global property
schemas, what you are doing is first writing the value into the property
bag and then marking it as promoted. When adding context values from
within a pipeline component, you should be aware that there are
different API calls for simply writing properties and actually promoting
them.
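The difference between writing and promoting can be shown with a toy property bag. This is a conceptual model, not the BizTalk implementation: the ContextPropertyBag class, the routable_properties helper, and the urn:example:orders namespace are hypothetical. In the pipeline API the two operations correspond to the Write and Promote methods of the message context.

```python
class ContextPropertyBag:
    """Toy property bag: key/value pairs plus a set of promoted keys."""

    def __init__(self):
        self._values = {}
        self._promoted = set()

    def write(self, name, namespace, value):
        # Adds or overwrites a value; the property is NOT made routable.
        self._values[(namespace, name)] = value

    def promote(self, name, namespace, value):
        # Promotion is a write followed by marking the key as promoted.
        self.write(name, namespace, value)
        self._promoted.add((namespace, name))

    def read(self, name, namespace):
        return self._values.get((namespace, name))

    def is_promoted(self, name, namespace):
        return (namespace, name) in self._promoted

    def routable_properties(self):
        # Only promoted properties are visible for subscription matching.
        return {k: v for k, v in self._values.items() if k in self._promoted}


NS = "urn:example:orders"  # hypothetical property schema namespace

bag = ContextPropertyBag()
bag.write("InternalNote", NS, "checked by hand")
bag.promote("CustomerId", NS, "C-1001")

print(bag.read("InternalNote", NS))  # checked by hand
print(bag.routable_properties())     # {('urn:example:orders', 'CustomerId'): 'C-1001'}
```

Both values can be read back, but only CustomerId would be considered when subscriptions are evaluated.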
It is critical to understand that everything that is written to the property bag is visible within the MMC. Likewise, it is quite easy to view the subscription information for any ports that route on context properties. If you are promoting properties into the message context, make sure that they do not contain any sensitive data. For example, if you have a field in a schema that contains credit card numbers, do not promote this value without taking precautions. If you do store credit card information in a schema, make sure to mark the element as sensitive within the schema definition. This will cause the BizTalk runtime to throw an error should that element's value be promoted. If it is absolutely necessary to promote this value, make sure you encrypt it using a third-party tool.